Integration of ASR and machine translation models in a document translation task

Authors

  • Aarthi M. Reddy
  • Richard C. Rose
  • Alain Désilets
Abstract

This paper addresses machine aided human language translation. It considers a scenario in which a human translator dictates a spoken translation of a source language text to an automatic speech dictation system, while the same source language text is also presented to a statistical machine translation (SMT) system. The techniques presented assume that the optimum target language word string produced by the dictation system is modeled using combined SMT and automatic speech recognition (ASR) statistical models. These techniques were evaluated on a speech corpus of human translators dictating English translations of French text obtained from transcriptions of the proceedings of the Canadian House of Commons. The combined ASR/SMT modeling techniques reduced ASR word error rate (WER) by 26.6 percent relative to the WER of an ASR system that did not incorporate SMT knowledge.
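The idea of combining the two statistical models can be illustrated with a minimal sketch. It assumes an N-best rescoring setup in which each ASR hypothesis carries a log score from the recognizer and a second log score from an SMT model conditioned on the source text. The abstract does not specify how the models are combined, so the log-linear sum, the `smt_weight` parameter, the `Hypothesis` class, and all scores below are hypothetical illustrations, not the authors' method or data. The helper `relative_wer_reduction` shows how a relative WER figure such as the one quoted above is computed.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    words: str          # candidate target language word string
    asr_logprob: float  # hypothetical log score from the ASR system
    smt_logprob: float  # hypothetical log score from the SMT model given the source text


def rescore(nbest, smt_weight=1.0):
    """Pick the hypothesis with the highest combined score.

    Assumed combination rule: asr_logprob + smt_weight * smt_logprob.
    """
    return max(nbest, key=lambda h: h.asr_logprob + smt_weight * h.smt_logprob)


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent, measured against the baseline WER."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Made-up N-best list: the recognizer alone slightly prefers the first string,
# but the SMT score strongly favors the second one.
nbest = [
    Hypothesis("the house rose at noon", asr_logprob=-10.0, smt_logprob=-9.0),
    Hypothesis("the house adjourned at noon", asr_logprob=-10.5, smt_logprob=-4.0),
]

asr_only = rescore(nbest, smt_weight=0.0)
combined = rescore(nbest, smt_weight=1.0)
```

With `smt_weight=0.0` the choice falls back to the recognizer's own ranking. A nonzero weight lets knowledge of the source text override an acoustically plausible but unlikely translation.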

Similar resources

Incorporating Knowledge of Source Language Text in a System for Dictation of Document Translations

This paper describes methods for integrating source language and target language information for machine aided human translation (MAHT) of text documents. These methods are applied to a language translation task involving a human translator dictating a first draft translation of a source language document. A method is presented which integrates target language automatic speech recognition (ASR)...


Towards domain independence in machine aided human translation

This paper presents an approach for integrating statistical machine translation and automatic speech recognition for machine aided human translation (MAHT). It is applied to the problem of improving ASR performance for a human translator dictating translations in a target language while reading from a source language document. The approach addresses the issues associated with task independent A...


Integration of Speech to Computer-Assisted Translation Using Finite-State Automata

State-of-the-art computer-assisted translation engines are based on a statistical prediction engine, which interactively provides completions to what a human translator types. The integration of human speech into a computer-assisted system is also a challenging area and is the aim of this paper. So far, only a few methods for integrating statistical machine translation (MT) models with automati...


A new model for Persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. In Persian morphology, a half-space character is needed to separate the parts of multi-part words, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...


On the Translation Quality of Google Translate: With a Concentration on Adjectives

Translation, whose first traces date back at least to 3000 BC (Newmark, 1988), has always been considered time-consuming and labor-intensive. In view of this, experts have made numerous efforts to develop mechanical systems that can reduce part of this time and labor. The advancement of computers in the second half of the twentieth century paved the way for the invention of machine tra...



Publication date: 2007